Deep-structured hidden conditional random fields for phonetic recognition
نویسندگان
چکیده
We extend our earlier work on deep-structured conditional random field (DCRF) and develop deep-structured hidden conditional random field (DHCRF). We investigate the use of this new sequential deep-learning model for phonetic recognition. DHCRF is a hierarchical model in which the final layer is a hidden conditional random field (HCRF) and the intermediate layers are zero-th-order conditional random fields (CRFs). Parameter estimation and sequence inference in the DHCRF are developed in this work. They are carried out layer by layer so that the time complexity is linear to the number of layers. In the DHCRF, the training label is available only at the final layer and the state boundary is unknown. This difficulty is addressed by using unsupervised learning for the intermediate layers and lattice-based supervised learning for the final layer. Experiments on the standard TIMIT phone recognition task show small performance improvement of a three-layer DHCRF over a two-layer DHCRF; both are significantly better than the single-layer DHCRF and are superior to the discriminatively trained tri-phone hidden Markov model (HMM) using identical input features.
منابع مشابه
Combining phonetic attributes using conditional random fields
A Conditional Random Field is a mathematical model for sequences that is similar in many ways to a Hidden Markov Model, but is discriminative rather than generative in nature. Here we explore the application of the CRF model to ASR processing by building a system that performs first-pass phonetic recogintion using discriminatively trained phonetic attributes. This system achieves an accuracy le...
متن کاملDiscriminative Phonetic Recognition with Conditional Random Fields
A Conditional Random Field is a mathematical model for sequences that is similar in many ways to a Hidden Markov Model, but is discriminative rather than generative in nature. In this paper, we explore the application of the CRF model to ASR processing of discriminative phonetic features by building a system that performs first-pass phonetic recognition using discriminatively trained phonetic f...
متن کاملHidden Conditional Neural Fields for Continuous Phoneme Speech Recognition
In this paper, we propose Hidden Conditional Neural Fields (HCNF) for continuous phoneme speech recognition, which are a combination of Hidden Conditional Random Fields (HCRF) and a MultiLayer Perceptron (MLP), and inherit their merits, namely, the discriminative property for sequences from HCRF and the ability to extract non-linear features from an MLP. HCNF can incorporate many types of featu...
متن کاملAttribute-based Mandarin speech recognition using conditional random fields
Integrating phonetic knowledge into a speech recognizer is a possible way to further improve the performance of conventional HMM-based speech recognition methods. This paper presents a cascaded architecture which consists of attribute detection and conditional random field to make use of phonetic knowledge within the phone decoding process. The attribute detection can be implemented by using an...
متن کاملSegmental conditional random fields with deep neural networks as acoustic models for first-pass word recognition
Discriminative segmental models, such as segmental conditional random fields (SCRFs), have been successfully applied to speech recognition recently in lattice rescoring to integrate detectors across different levels of units, such as phones and words. However, the lattice generation has been constrained by a baseline decoder, typically a frame-based hybrid HMMDNN system, which still suffers fro...
متن کامل